---
title: Docx files
---

`DocxLoader` extracts text from Microsoft Word documents. It supports both the modern `.docx` format and the legacy `.doc` format, and each format requires its own additional dependency.

---

## Setup

To use `DocxLoader`, you'll need the `@langchain/community` integration along with either the `mammoth` or the `word-extractor` package:

- **`mammoth`**: For processing `.docx` files.
- **`word-extractor`**: For handling `.doc` files.

### Installation

#### For `.docx` Files

```bash npm
npm install @langchain/community @langchain/core mammoth
```

#### For `.doc` Files

```bash npm
npm install @langchain/community @langchain/core word-extractor
```

## Usage

### Loading `.docx` Files

For `.docx` files, pass only the file path when initializing the loader; no additional options are needed:

```javascript
import { DocxLoader } from "@langchain/community/document_loaders/fs/docx";

const loader = new DocxLoader(
  "src/document_loaders/tests/example_data/attention.docx"
);

const docs = await loader.load();
```
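
Once loading finishes, it can help to check what was extracted. The following is a minimal sketch, assuming each loaded document exposes `pageContent` and `metadata`, and that `metadata.source` holds the file path when it is present. `summarizeDocs` and `exampleDocs` are hypothetical names used only for illustration; they are not part of `@langchain/community`.

```javascript
// Hypothetical helper (not part of @langchain/community): builds a short
// summary of each loaded document so you can check what was extracted.
function summarizeDocs(docs, previewLength = 80) {
  return docs.map((doc) => ({
    // `metadata.source` is assumed to hold the file path; it may be absent.
    source: doc.metadata?.source ?? "(unknown)",
    // Total number of characters of extracted text.
    characters: doc.pageContent.length,
    // First few characters, useful for a quick visual check.
    preview: doc.pageContent.slice(0, previewLength),
  }));
}

// Stand-in for the array returned by `await loader.load()`.
const exampleDocs = [
  {
    pageContent: "Example text extracted from a Word document.",
    metadata: { source: "example.docx" },
  },
];

console.log(summarizeDocs(exampleDocs));
```

In real use, pass the result of `await loader.load()` to the helper in place of `exampleDocs`.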

### Loading `.doc` Files

For `.doc` files, you must explicitly specify the `type` as `doc` when initializing the loader:

```javascript
import { DocxLoader } from "@langchain/community/document_loaders/fs/docx";

const loader = new DocxLoader(
  "src/document_loaders/tests/example_data/attention.doc",
  {
    type: "doc",
  }
);

const docs = await loader.load();
```
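
When a project handles both formats, the choice of options can be derived from the file extension. The sketch below is one possible approach, not something the loader provides: `docxLoaderOptionsFor` is a hypothetical helper. It returns `{ type: "doc" }` for `.doc` files and `undefined` for `.docx` files, so the loader falls back to its default behavior.

```javascript
// Hypothetical helper (not part of @langchain/community): returns the second
// constructor argument for DocxLoader based on the file extension.
function docxLoaderOptionsFor(filePath) {
  const lower = filePath.toLowerCase();
  if (lower.endsWith(".doc")) {
    // Legacy Word format: the type must be given explicitly.
    return { type: "doc" };
  }
  if (lower.endsWith(".docx")) {
    // Modern format: no options are needed, so none are returned.
    return undefined;
  }
  // Anything else is not a Word file this loader is documented to handle.
  throw new Error(`Unsupported Word file extension: ${filePath}`);
}

console.log(docxLoaderOptionsFor("report.doc"));
console.log(docxLoaderOptionsFor("report.docx"));
```

The result can be passed straight through as the second argument, for example `new DocxLoader(filePath, docxLoaderOptionsFor(filePath))`.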
